Metabarcoding and Metagenomics
● Pensoft Publishers
Preprints posted in the last 90 days, ranked by how well they match Metabarcoding and Metagenomics's content profile, based on 12 papers previously published here. The average preprint has a 0.00% match score for this journal, so anything above that is already an above-average fit.
Duarte, S.; Costa, F.
Early detection and monitoring of non-indigenous species (NIS) is crucial to prevent their establishment and to reduce ecological and economic impacts in coastal ecosystems. Traditional monitoring approaches, which rely largely on morphological identification of collected organisms, are often time-consuming and may fail to detect species that occur at low abundance, are morphologically cryptic, or are present in the form of inconspicuous life stages. DNA-based approaches, particularly those using environmental DNA, have proven well suited to biodiversity monitoring and biosecurity surveillance. By examining genetic material from bulk community samples or material released into the environment, DNA-based approaches enable the detection of species without the need for direct observation, thereby increasing detection sensitivity and expanding the scope of monitoring programs. Despite the rapid growth of their use in marine monitoring, a global synthesis of the status and trends of DNA-based approaches for detecting NIS in this environment has been lacking. Here, we present such a synthesis, based on 146 published studies employing DNA for NIS detections in coastal environments. Two main methodological approaches were used across the reviewed studies, namely DNA metabarcoding, which was applied in 49% of studies, closely followed by targeted single-species PCR assays, used in 42% of the studies. A smaller proportion of studies (10%) combined both approaches, integrating broad community screening with targeted detection to improve surveillance efficiency. Globally, 752 NIS were detected across disparate taxonomic groups, with metazoans representing the largest proportion of detections (464 species), followed by Chromista (210 species) and Plantae (77 species).
Among these, the most frequently detected taxonomic groups included Dinophyceae (Dinoflagellata), Teleostei (Chordata), Florideophyceae (Rhodophyta), Polychaeta (Annelida), Copepoda and Malacostraca (Arthropoda), and Ascidiacea (Chordata). At the species level, several well-known marine invaders were recurrently reported, including Bugula neritina (Linnaeus, 1758), Styela plicata (Lesueur, 1823), Acartia (Acanthacartia) tonsa Dana, 1849-1852, and Botryllus schlosseri (Pallas, 1766), highlighting the ability of DNA approaches to detect widespread and established invaders across different regions. The mitochondrial cytochrome c oxidase subunit I (COI) gene was the most widely used genetic marker, reflecting its broad taxonomic coverage and extensive representation in reference databases, particularly for targeting Metazoa. Ribosomal RNA genes, particularly 18S and 16S rRNA gene markers, were also frequently employed to target a wider range of eukaryotic taxa. Water was by far the most frequently analyzed substrate, followed by zooplankton and biofouling communities collected from man-made structures. Notably, approximately 31% of all NIS detections reported in the reviewed studies constituted new regional records. These results highlight the potential of eDNA for coastal monitoring but also underline important limitations. Persistent geographical, taxonomic, and methodological biases can affect detection outcomes, and reliance on single sample types or markers may increase false negatives - particularly critical for NIS early detection. Therefore, multi-marker and multi-substrate approaches are essential to improve detection reliability and support effective biosecurity strategies. As reference databases continue to expand and methodological protocols become increasingly standardized, DNA-based monitoring is likely to play a central role in future management and surveillance of biological invasions in coastal ecosystems.
Graphical Abstract: Figure 1.
Fredrick Onyango, O.; Okello, J. A.; Muchiri, Z.; Mwamburi, S. M.; Labatt, C.; Owiro, E. O.; Cherono, S.
Assessing and monitoring biodiversity in mangrove ecosystems remains challenging, with most studies relying on proxy indicators to infer biodiversity status. This limits understanding of biodiversity dynamics and constrains evidence-based mangrove management. In the Western Indian Ocean region, biodiversity assessments in mangrove forests remain scarce, with no clear information on spatiotemporal and taxonomic coverage. Addressing these gaps requires examining existing biodiversity records and exploring complementary approaches that can broaden the scope and efficiency of biodiversity monitoring. This study assessed the current state of biodiversity assessments in mangrove forests in Kenya and evaluated the feasibility of environmental DNA (eDNA) as a complementary biodiversity monitoring tool. A systematic literature review was conducted by retrieving published sources from major academic databases using defined search terms to extract and compile taxonomic information. In addition, a snapshot eDNA survey was carried out in selected mangrove forests, where sediment and water samples were collected, processed, and analyzed using established molecular and bioinformatics pipelines. The literature review identified 26 sources documenting biodiversity across 15 mangrove forest areas, with 68% of the studies concentrated in four sites representing about 6% of mangrove cover in Kenya. A total of 1,044 unique taxa belonging to 255 families were identified, with the classes Teleostei, Aves, Chromadorea, and Malacostraca accounting for 84.5% of documented taxa. The eDNA survey detected heterogeneous taxa from multiple ecosystems, including 502 taxa belonging to 305 families. Only 67 families were common to both datasets, highlighting the complementarity of literature-based inventories and eDNA detection. While eDNA showed considerable potential to expand biodiversity detection, its application is constrained by a number of factors.
Integrating eDNA as a core biodiversity monitoring tool in mangroves will require combining conventional surveys with molecular tools, developing curated regional DNA reference databases, and adopting standardized analytical frameworks.
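The limited overlap reported above (67 shared families out of 255 in the literature and 305 in the eDNA survey) can be expressed as a Jaccard similarity. The following is a minimal sketch rather than the authors' analysis: the function names and the toy family sets are hypothetical, and only the three counts are taken from the abstract.

```python
from __future__ import annotations


def jaccard_from_counts(n_a: int, n_b: int, n_shared: int) -> float:
    """Jaccard similarity from two set sizes and the size of their intersection."""
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed either set size")
    union = n_a + n_b - n_shared
    if union <= 0:
        raise ValueError("union must be positive")
    return n_shared / union


def family_overlap(literature: set[str], edna: set[str]) -> dict[str, int]:
    """Count families shared between, or unique to, two inventories."""
    return {
        "shared": len(literature & edna),
        "literature_only": len(literature - edna),
        "edna_only": len(edna - literature),
    }


# Counts reported in the abstract: 255 literature families, 305 eDNA families, 67 shared.
reported_similarity = jaccard_from_counts(255, 305, 67)
print(f"Jaccard similarity of family lists: {reported_similarity:.3f}")

# Hypothetical toy inventories, for illustration only.
toy_literature = {"Gobiidae", "Ardeidae", "Portunidae"}
toy_edna = {"Gobiidae", "Portunidae", "Mugilidae", "Nereididae"}
print(family_overlap(toy_literature, toy_edna))
```

A low similarity on this measure is consistent with the abstract's point that the two sources are complementary rather than redundant.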
Tedersoo, L.; Prous, M.; Chen, M.; Anslan, S.; Saar, I.; Dubois, B.; Mikryukov, V.
Metabarcoding is a powerful tool for biodiversity comparisons, where standard-size DNA barcodes (>500 bases) offer better taxonomic resolution than shorter ones. Still, the choice of sequencing platforms and bioinformatics pipelines may strongly affect inferred diversity due to various technical biases. We assessed the relative performance of Illumina MiSeq i100 (2x500 paired-end), PacBio Revio and Oxford Nanopore MinION sequencing and bioinformatics pipelines, using full-length ITS amplicon sequencing datasets from a 103-species mock community and 45 composite soil samples. Despite numerous low-quality reads, PacBio yielded the lowest overall error rate and highest number of taxa. Illumina revealed the highest proportion of chimeric and index-switched reads, along with a strong bias towards shorter amplicons. MinION data analysed using PRONAME and Minovar - a bioinformatics pipeline presented here - had the largest proportion of low-quality data, and rare taxa were lost during data filtering and read polishing steps. Although Minovar enabled amplicon sequence variant (ASV) level precision for common taxa, we recommend clustering ASVs into OTUs. For PacBio, standard filtering approaches outperformed the ASV approach because they retained rare taxa. For Illumina, a stringent ASV approach or removal of rare OTUs would limit artefacts. Across all platforms, excess PCR cycles promoted chimeric and low-quality reads and lost quantitativity in biodiversity assessments. With moderate differences in effect sizes, all analytical approaches supported the conclusion that sampling design determines how we see soil biodiversity responses to land use. For biodiversity surveys based on the full-length ITS metabarcoding, we recommend using PacBio sequencing with standard, non-ASV pipelines.
Mauvisseau, Q.; Ewer, I.; Blumeris, I.; Iren Bongo, S.; Filipe Brito de Oliveira, L.; Gouvea, B.; Carolina Cei, A.; Ferreira Rodrigues, K.; de Arruda Francisco, J.; Sletteng Garvang, E.; Marena do Rego Henriques, V.; Hurtado Solano, S.; Kvalheim, L.; Kaylynne Lawrence, S.; Ramalho Maciel, B.; Isanda Masaki, H.; Fortunate Mashaphu, M.; Masimula, L.; Prudent Mokgokong, S.; Katrin Onshuus, E.; Lima Paiva, B.; Parker-Allie, F.; Du Plessis, M.; Puzicha, M.; Gabriel Da Silva Solano Reis, O.; Speelman, G.; Moritz Splitthof, W.; Stocco de Lima, A. C.; Strindberg, H.; Smoge Saevik, O.; Tafjord, N. J. D
Environmental DNA metabarcoding is a powerful monitoring tool for assessing aquatic biodiversity, as well as the sustainability and impacts of fisheries and aquaculture. However, conventional laboratory workflows remain time-consuming and dependent on dedicated infrastructures. Here, we present a field trial of a fully portable, off-grid eDNA metabarcoding pipeline that enables end-to-end analysis within a few days using compact equipment, including a BentoLab workstation and an Oxford Nanopore Technologies (ONT) MinION sequencer. The workflow was implemented during two international training courses in Norway and Brazil, where students and early career researchers collected environmental samples, extracted and amplified DNA, prepared DNA libraries, and sequenced on-site before performing bioinformatics and statistical analyses. In the case study detailed here, seven eDNA samples collected and analysed on-site in the Oslofjord allowed detection of 16 fish and elasmobranch species. Although overall diversity was lower than in earlier studies using Illumina-based sequencing, our protocol reliably detected key species and demonstrates that portable eDNA metabarcoding is feasible for rapid ecological assessment, surveillance of high-risk regions and/or deployment in remote or resource-limited settings.
Roussel, J.-M.; Quemere, E.; Bonnet, B.; Covain, R.; Dezerald, O.; Lassalle, G.; Le Bail, P.-Y.; Petit, E. J.; Pottier, G.; Quartarollo, G.; Vigouroux, R.; Lalague, H.
1. Environmental DNA (eDNA) metabarcoding of water samples is increasingly used to detect fish species in streams. Several studies have concluded that it can outperform traditional inventory methods and recommend using it at large scales for fish-based ecological assessments. However, there is no standard protocol that can guarantee sufficient detection rates and repeatability, despite companies offering an extensive range of analyses.
2. We compared eDNA metabarcoding performed by four companies. Following their guidelines, samples were collected in a small tropical stream in the Maroni River (French Guiana) that hosts a species-rich fish community. We compared their inventories to each other and to a list of species captured during an extensive fish inventory performed immediately after sampling eDNA, as well as to current data on the species distributions.
3. The number of species detected by eDNA metabarcoding ranged from 5 to 48 among the companies, but these inventories contained many inaccuracies. All companies combined, 63 species were detected, of which 10 (16%) had never been reported in the Maroni River. The extensive inventory identified 50 species in the local fish community, of which 16-46 were not detected by eDNA metabarcoding (i.e. false negative detection rate of 32%-92% among the companies).
4. Reanalysis of raw sequencing data decreased differences among companies greatly, highlighting the importance of using a comprehensive and accurate DNA barcode database to assign species. Dissimilarity indices, calculated to compare the local fish community (based on presence/absence or fish catches) to eDNA detection, revealed large differences regardless of the company.
5. Summary and applications. The large percentage of species not detected by eDNA metabarcoding of water samples could strongly bias fish-diversity inventories in streams that host species-rich communities. This issue is not well documented in the literature, and we recommend that similar studies in the future focus on other stream contexts. The large differences between commercial eDNA inventories and the local fish community challenge the use of eDNA metabarcoding for fish-based ecological assessments of streams. The variable performance of eDNA companies indicates the need for a standard protocol and access to a comprehensive DNA database before beginning large-scale eDNA programmes.
Highlights
- eDNA metabarcoding of water samples is widely used to detect species in streams
- Detection performances of 4 private companies were compared to an exhaustive fish inventory
- The number of undetected species varies from 32 to 92% depending on the company
- Such discrepancies challenge the use of eDNA for fish-based ecological assessments
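The false negative range quoted in this abstract follows directly from the reported counts: 16 to 46 of the 50 species in the reference inventory went undetected. A minimal sketch of that arithmetic is given below; the function name `false_negative_rate` is an assumption introduced for illustration, and the counts are those stated in the abstract.

```python
def false_negative_rate(n_reference: int, n_missed: int) -> float:
    """Share of reference-inventory species that a method failed to detect."""
    if n_reference <= 0:
        raise ValueError("reference inventory must contain at least one species")
    if not 0 <= n_missed <= n_reference:
        raise ValueError("missed count must lie between 0 and the inventory size")
    return n_missed / n_reference


REFERENCE_SPECIES = 50  # species captured in the extensive fish inventory

# Best and worst company results reported in the abstract.
for missed in (16, 46):
    rate = false_negative_rate(REFERENCE_SPECIES, missed)
    print(f"{missed} of {REFERENCE_SPECIES} species missed: {rate:.0%}")
```

The same ratio applied to the 10 of 63 pooled detections never reported from the Maroni River gives the 16% figure quoted for unexpected species.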
Marquez, E. J.; Garcia-Castro, K. L.; Alvarez, D. R.; DoNascimiento, C.
Astyanax Baird & Girard, 1854 is a widely distributed and species-rich genus of Acestrorhamphidae, whose abundant populations in Neotropical basins play a crucial ecological role at the trophic level. Taxonomic uncertainties persist within the genus, as seen in Astyanax sp. (formerly designated as A. fasciatus) from the Magdalena basin in Colombia. Concerns about its genetic status are heightened by the ecological threats posed by hydroelectric dams, ranging from habitat loss to disrupted river connectivity. We isolated and characterized 17 microsatellite loci to assess the population genetics of this species in a broad sample from the middle and lower sections of the Cauca River, now interrupted by the Ituango dam. Furthermore, a multidisciplinary approach integrating phylogenetic analyses of mitochondrial (COI) and nuclear (rag2) markers with geometric morphometric analyses was employed to evaluate potential cryptic diversity within Astyanax sp. Microsatellites revealed two genetic groups in the studied area, strongly supported as distinct lineages by phylogenetic analyses. Unexpectedly, one of these lineages of Astyanax sp. was recovered in an unresolved clade with samples of A. microlepis and allopatric samples of A. viejita from the Maracaibo Lake basin. Each genetic group showed high genetic diversity, but also evidence of recent bottleneck events and significantly high inbreeding values. Morphometric analyses provided evidence of significant phenotypic differentiation among A. microlepis, Astyanax sp. 1 (Asp1), and Astyanax sp. 2 (Asp2). Morphological patterns ranged from the robust profile of A. microlepis to the streamlined shape of Astyanax sp. 2 (Asp2), with Astyanax sp. 1 (Asp1) displaying intermediate traits and localized differences in head length and fin placement.
Statistical support from permutation tests and a high overall classification accuracy (95.65%) underscore the existence of distinct morphospecies, suggesting that phenotypic differentiation is well-established, despite the complex evolutionary history of the group. This study suggests the presence of cryptic diversity within Astyanax sp. and provides valuable genetic information for the conservation and management of their populations in the Magdalena basin.
Hanfling, B.; Griffiths, N. P.; Macarthur, J. A.; Morrisey, B.; Svobodova, D.; Pritchard, V. L.; Tree, A.; Gaywood, M. J.
1. Environmental DNA (eDNA) metabarcoding is an emerging tool for biodiversity assessment in freshwater systems, offering high-resolution insights into community composition. Here, we apply eDNA metabarcoding to evaluate the ecological impacts of Eurasian beaver (Castor fiber) activity within a seminatural enclosure in the Scottish Highlands.
2. We collected seasonal water samples from nine sites, six influenced by beaver dams and three control sites with no evidence of beaver engineering, across a 40-hectare enclosure. Samples were analysed for vertebrate and macroinvertebrate diversity using established 12S and COI markers.
3. Vertebrate alpha diversity did not differ significantly between beaver and control sites, likely reflecting the small spatial scale and low species richness of upland Scottish streams. However, community composition differed significantly between treatments, especially for fish (PERMANOVA, R² = 0.55, P < 0.001), with beaver-influenced sites dominated by three-spined stickleback and control sites by brown trout. Macroinvertebrate communities showed a 78% increase in gamma diversity in beaver-modified habitats relative to controls. Species composition varied strongly with beaver presence (PERMANOVA, R² = 0.29, P < 0.001), likely due to the creation of lentic-lotic mosaics and associated microhabitat diversity. Seasonal variation was significant in both taxonomic groups, with the lowest species richness and highest community dispersion observed in summer, probably reflecting hydrological and temperature-driven dynamics in eDNA production and transport.
4. Our findings reinforce previous evidence that beaver dam-building activity enhances beta diversity in headwater systems. Additionally, we demonstrate that eDNA metabarcoding is a sensitive method for detecting spatial patterns in freshwater biodiversity associated with these activities at scales ranging from tens to hundreds of meters. These approaches could inform future monitoring strategies aligned with landscape-scale beaver management and reintroductions.
Castillo, A. H.; Jacobs, S.; Steinke, D.; Smith, M. A.
Leaf litter ecosystems and their fauna are largely understudied, despite their critical ecological roles. Here, we investigate challenges associated with estimating biodiversity in terrestrial leaf litter. Current methodologies for biodiversity assessment are fraught with limitations; among the most significant are a decline in taxonomic expertise, which complicates species identification, and the high costs associated with species-level morphological identifications. DNA barcoding employs the mitochondrial gene cytochrome c oxidase I (COI) to identify animal species, and DNA metabarcoding facilitates the identification of multiple species without necessitating taxonomic expertise. Recent studies indicate that environmental DNA (eDNA) may exhibit greater sensitivity compared to traditional methods. To test whether these methods work in a real-world application, we sampled leaf litter across a temperate forest/field ecotone. Leaf litter was dried, ground and processed to extract environmental DNA. We evaluated the DNA extraction protocols to test their relative efficacy. We found that the Qiagen Blood and Tissue Kit was the most effective at recovering invertebrate diversity and that there were notable differences in biodiversity between forest and field habitats. Temperature emerged as a significant factor influencing the composition of the communities observed. Our methodology is applicable across various environments for efficient biodiversity assessment and might be particularly beneficial for monitoring pests and invasive species. Our approach offers a cost-effective and timely alternative to conventional biodiversity assessment methods and underscores the significance of accurate assessment methodologies for leaf litter communities.
Wolany, L.; Klinkenborg, K.; Leese, F.; Buchner, D.
DNA metabarcoding is a central tool in biodiversity research and monitoring, producing detailed taxa lists with comparatively little time and effort. One of its limitations, however, is the lack of quantitative data on biomass or abundance. This limitation has two main reasons: 1) template copy number variation and 2) primer-induced amplification bias. Many metabarcoding markers are mitochondrial and mitochondrial copy numbers vary in animal tissues, potentially decoupling sequence counts from biomass. Additionally, primer mismatches can lead to taxon-specific amplification biases, for which PCR cycle calibration has been proposed as a solution. To mechanistically study both effects, we constructed mock communities of different arthropod species. We combined digital droplet PCR and COI metabarcoding to quantify relationships between biomass, mitochondrial copy number and metabarcoding reads. Mitochondrial DNA copy numbers per biomass varied strongly within and among the different taxa. Metabarcoding reads did not reflect input mitochondrial DNA copies without a correction. Attempts to correct for amplification bias via PCR cycle calibration failed as read proportions remained stable across cycles. We therefore mathematically derived an approach to estimate relative amplification bias and initial mitochondrial DNA copy numbers in a sample based on a non-exponential amplification bias model and demonstrate its applicability. Still, the detected high variation in mitochondrial copy numbers and derived prerequisites necessary to calculate amplification efficiencies and mitochondrial copy numbers limit the practical application. Our study highlights fundamental constraints of quantitative metabarcoding and underscores the need for additional methodological approaches for quantitative insights while delivering essential conceptual insights.
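For context, PCR cycle calibration rests on the textbook exponential amplification model, in which a template with efficiency e grows as N0 * (1 + e)^c over c cycles, so read proportions should drift as cycles increase whenever efficiencies differ. The sketch below illustrates only that textbook model; it is not the non-exponential model derived in the preprint, and the copy numbers and efficiencies are hypothetical values chosen for illustration.

```python
def read_proportions(initial_copies, efficiencies, cycles):
    """Expected template proportions after PCR under simple exponential amplification.

    Each taxon i grows as initial_copies[i] * (1 + efficiencies[i]) ** cycles,
    where an efficiency of 1.0 means perfect doubling in every cycle.
    """
    if len(initial_copies) != len(efficiencies):
        raise ValueError("need one efficiency per taxon")
    amplified = [n0 * (1.0 + e) ** cycles for n0, e in zip(initial_copies, efficiencies)]
    total = sum(amplified)
    return [a / total for a in amplified]


# Hypothetical two-taxon mock community with equal input copies
# but unequal amplification efficiencies (e.g. due to primer mismatches).
copies = [1000.0, 1000.0]
effs = [0.95, 0.80]

for c in (0, 10, 20, 30):
    p = read_proportions(copies, effs, c)
    print(f"cycles={c:2d}  taxon A={p[0]:.3f}  taxon B={p[1]:.3f}")
```

Under this model the better-amplifying taxon gains share with every cycle, which is the behaviour the abstract reports not observing: read proportions remained stable across cycles.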
Scharf, S. A.; Spohr, P.; Ried, M. J.; Haas, R.; Klau, G. W.; Henrich, B.; Pfeffer, K.
Multiplexing samples in long-read sequencing with Oxford Nanopore Next Generation Sequencing Technology (ONT) by ligating specific native barcodes to individual DNA samples enables a significant increase in sequencing throughput combined with a significant reduction in sequencing costs. However, this advantage carries the risk of barcode misassignment (crosstalk). When multiplexing samples for ONT sequencing, we observed misassigned barcodes, so-called barcode crosstalk, after ONT library preparation according to the standard protocol, particularly in samples with low input DNA concentrations. We assumed that these barcode misassignments are largely due to misligation of remaining native barcodes during the subsequent sequencing adapter ligation. To systematically investigate and quantify barcode crosstalk, genomic DNA (gDNA) from four bacterial type strains with different DNA input concentrations was prepared using three protocols for library preparation: the Nanopore standard protocol (protocol A: version valid until July 2, 2025), the new Nanopore protocol (protocol B: version from July 2, 2025), and an in-house protocol with pooling of the barcoded samples only after the sequencing adapter ligation step (protocol C: in-house). All samples were sequenced on a Nanopore PromethION device. The results clearly showed that the use of protocol A resulted in pronounced barcode crosstalk, especially detectable in samples with low DNA input concentrations (up to 2.4% misassigned reads). The ONT adjustment in protocol B (altered washing buffer vs. protocol A) significantly alleviated the barcode crosstalk to below 0.01%, whereas protocol C eliminated barcode crosstalk virtually completely. These observations emphasize that sequencing results obtained with older ONT native barcoding protocol variants should be critically reviewed.
The newer ONT barcoding protocol is preferable for sequencing, but it does not completely eliminate the barcode crosstalk effect. In conclusion, for low DNA input and high accuracy sequencing, protocol C is recommended.
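A crosstalk rate of the kind reported here is the fraction of reads whose assigned barcode differs from the barcode expected for their source sample. The sketch below assumes each read can be traced to a known source (for example by mapping it to one of the bacterial type strains); that pairing step, the function name and the toy read list are all hypothetical and are not taken from the preprint.

```python
def crosstalk_rate(reads):
    """Fraction of reads assigned to a barcode other than the expected one.

    `reads` is an iterable of (expected_barcode, assigned_barcode) pairs.
    """
    total = 0
    misassigned = 0
    for expected, assigned in reads:
        total += 1
        if expected != assigned:
            misassigned += 1
    if total == 0:
        raise ValueError("no reads supplied")
    return misassigned / total


# Hypothetical toy data: 1000 reads from the sample barcoded NB01,
# 24 of which were demultiplexed into other barcodes.
toy_reads = [("NB01", "NB01")] * 976 + [("NB01", "NB02")] * 15 + [("NB01", "NB03")] * 9

rate = crosstalk_rate(toy_reads)
print(f"misassigned reads: {rate:.1%}")
```

The toy counts were chosen so that the result matches the upper bound quoted for protocol A; they do not represent the study's data.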
Monaghan, A. I. T.; Sellers, G. S.; Griffiths, N. P.; Lawson Handley, L.; Hänfling, B.; Macarthur, J. A.; Wright, R. M.; Bolland, J. D.
Effective monitoring of the critically endangered European eel (Anguilla anguilla) is essential for conservation planning and regulatory decision-making, particularly in heavily fragmented rivers. Environmental DNA (eDNA) methods offer sensitive alternatives to traditional surveys, but there is uncertainty around whether targeted assays or community-wide approaches are better suited to achieve monitoring objectives. We compared eDNA metabarcoding and species-specific quantitative PCR (qPCR) for detecting A. anguilla across 145 pumped catchments in the Fens, East Anglia, England. All sites were sampled once initially, and sites negative for A. anguilla were re-sampled based on metabarcoding results. This allowed comparison of detection rates from a single water sample and site-level retrospective identification of sites where qPCR could have identified A. anguilla in earlier samples. The findings were also set in the context of the wider biodiversity information generated by metabarcoding. From the initial (single) water sample, qPCR detected A. anguilla at seven more sites than metabarcoding (17 versus 10). With repeated sampling, metabarcoding detected A. anguilla at 43 sites, including all but one of the sites where qPCR detected A. anguilla, and ten sites where qPCR did not detect A. anguilla within the same number of samples. Indeed, the additional sampling effort required to detect A. anguilla with metabarcoding at sites also positive with qPCR was small relative to the overall sampling effort. Furthermore, metabarcoding additionally detected 28 non-target fish species alongside fish, amphibian and mammal species of conservation concern. Our results highlight trade-offs between target-species sensitivity and the broader ecological information provided by each method, and support metabarcoding as an effective tool for a holistic conservation approach, with the additional community data outweighing the marginally increased sensitivity of qPCR.
O'Brien, K.; Elamaran, A.; Dayi, M.; Keeling, G.; Nevin, W. D.; Liu, Y.; Viney, M.; Reynolds, K.; Bishop, C.; Sripa, B.; Woubshete, M.; Sachs Nique, P.; Wright, R.; Younger, J.; Hunt, V. L.
Soil-transmitted helminths (STHs) pose significant challenges to public health in endemic areas, necessitating reliable methods for their detection. Shotgun metagenomics enables simultaneous detection of STHs and microbes in a sample without prior knowledge of what is present. However, validation of shotgun metagenomics with known infection intensity or across different sequencing platforms has not been carried out for eukaryote parasites including STHs, and false positives remain a pervasive issue. We validated shotgun metagenomics as a method of STH detection in faecal samples. Using the Strongyloides ratti laboratory model of an STH infection, we investigated how analytical methods (nucleotide-nucleotide matching, nucleotide-protein matching, marker gene detection, mitochondrial mapping), infection intensity and sequencing technology (short-read vs. long-read) affect sensitivity and specificity of detection. S. ratti was accurately detected at a standard laboratory dose, but low intensity infections were more difficult to detect. Only mitochondrial sequence mapping was 100% accurate at identifying S. ratti with no false positives. Overall, short-read outperformed long-read sequencing methods. We applied the same analytical methods to human faecal samples with confirmed infections for at least one of four STHs. Mitochondrial sequence mapping was also the most effective method for detecting STHs in human faecal samples, detecting 100% of Necator americanus and 92% of Ascaris spp. infections, but could not reliably detect STHs where DNA levels are expected to be low or variable. In conclusion, mitochondrial mapping was the most effective method of detection for sensitivity and specificity in both the laboratory system and human faecal samples. Our findings indicate that shotgun metagenomics should be approached cautiously using validated methods, particularly when infection intensity or DNA levels are expected to be low.
Author Summary: Soil-transmitted helminths (STH), such as the parasite Strongyloides, are important gastrointestinal parasites of humans and livestock. Accurate methods of detection for diagnostics and monitoring are important to implement suitable control and treatment strategies. Here we validate a shotgun metagenomics approach, where all DNA in a sample is sequenced, for detecting STH in faecal samples using a Strongyloides laboratory model for infection. Strongyloides was reliably detected in faecal samples at higher infection levels, but mitochondrial genome mapping of the sequences was the only analytical method that reliably detected Strongyloides at lower infection levels. These results were reflected in stool samples from humans infected with STH, where mitochondrial mapping was also the most reliable method. However, species that were associated with low levels of parasite material or DNA in the faeces, including Strongyloides stercoralis, were more difficult to detect. We compared two sequencing methods, short-read Illumina and long-read Oxford Nanopore Technologies, and short-read outperformed long-read shotgun metagenomics. Contamination of parasite genome assemblies with bacterial sequences was problematic for analysis and contributed to false positive results. Future work should focus on specific targeting of eukaryote DNA either at the laboratory or bioinformatic stage to improve STH detection further.
Keene, D.; Arya, S.; Walker, B.; Laumer, C. E.
Molecular data have revolutionised taxonomic and ecological research on the hyperdiverse communities of aquatic benthic microinvertebrates known as meiofauna. However, reference sequence databases remain highly incomplete, with variable barcode genes or fragments studied from taxon to taxon. Furthermore, there is a typical tradeoff between universality of primers and phylogenetic resolution, with rRNA markers being robustly recoverable but failing to resolve species-level divergences, and mitochondrial markers showing the reverse trend. Here, we introduce Oxford Nanopore rRNA and COI amplicon sequencing (OrCa-seq), a rapid, low-cost protocol for parallel long-range PCR amplification and multiplexed sequencing of four amplicons, spanning the nearly-complete rRNA cistron (~7-8 kb) and the widely studied Folmer region of COI (represented as overlapping 313 and 658 bp amplicons). This protocol, with its associated bioinformatic workflow, was designed for conducting biodiversity inventories of meiofauna and can be easily carried out in field research and educational contexts, with data available from 96-well plates of specimens within a day of lysis. To validate the method, we processed six plates of student-isolated freshwater and limno-terrestrial meiofauna, characterising the recovery of target genes and taxa with both automated and human-curated BLAST database comparisons. These data demonstrate the universal applicability of OrCa-seq across effectively all meiofauna, including the very smallest species. Nonetheless, recovery efficiency for each amplicon shows variation by taxon, with the full-length Folmer COI amplicon standing out as the most challenging. We present exemplar phylogenetic trees integrating reference sequences, demonstrating the utility of these data in confirming morphological determinations and in identifying anonymous specimens in a reverse taxonomy context.
While developed in a specific educational context for use on meiofauna, the OrCa-seq approach should be readily scalable to larger research datasets and adaptable to many specimen types and to any combination of taxon- or target-specific primers. As such, it represents a compelling multi-locus extension to the ever-growing repertoire of nanopore DNA barcoding protocols.
Piovesan, A.; Praz, C.; Voelkl, B.; Lanz, S.; Neumann, P.; Beaureapaire, A.
Show abstract
Pollinator populations are facing worldwide declines, underscoring conservation needs. Yet, conservation assessments still mostly rely on occurrence data, often derived from heterogeneous and opportunistic observations. While such data can inform on species presence and distribution, they may overlook important markers of population declines. This is particularly problematic for social species such as bumble bees, which typically exhibit low effective population sizes despite high abundance of workers observed in the field. Despite these putative pitfalls, the relationship between occurrence-based and genetic-based estimates remains largely unexplored in social bees. We here investigated spatio-temporal genetic patterns in five Swiss Bombus species representing contrasting population trajectories over the last century: B. humilis and B. sylvarum (stable), B. ruderatus (increasing), B. pomorum (regionally extinct), and B. veteranus (declining). Museum specimens collected between 1929 and 2023 were genotyped at 11 microsatellite loci to compare spatio-temporal fluctuations in genetic diversity and population structure with occurrence data. Overall, multilocus heterozygosity and allelic richness remained stable in all species during the time period investigated, indicating that the diverging population trends did not result in substantial variation of genetic diversity. In contrast, strong and significant shifts in allelic frequencies between time periods were detected in three species, suggesting recent immigration events. Isolation by distance was detected in the cold-adapted B. veteranus, while the extant warm-adapted species (B. humilis, B. sylvarum, B. ruderatus) showed high levels of gene flow between locations. In B. pomorum, increasing genetic homogenization was observed before extinction. 
Altogether, these findings show that genetic diversity indices are not the best-suited tools for monitoring the conservation status of social bee populations, and that estimates of population structure such as allelic shifts may be more informative. Moreover, these results highlight the importance of monitoring metapopulation dynamics and ensuring connectivity among populations to facilitate gene flow and enable demographic rescue processes.
Zeng, K.; Fodor, A. A.
Show abstract
Background: In microbiome research, differential abundance analysis aids in identifying significant differences in microbial taxa across two or more conditions. Statistical approaches used for this purpose include classical tests such as the t-test and Wilcoxon test, as well as methods designed to account for the compositional nature of microbiome data, including ALDEx2, ANCOM-BC2, and metagenomeSeq. In addition, methods originally developed for RNA sequencing data, such as DESeq2 and edgeR, have been frequently applied to microbiome studies. However, the use of these methods has been controversial. One area of concern is whether different modeling frameworks produce accurate p-values when the null hypothesis is true. Results: We evaluated eight methods across six publicly available datasets. Four permutation strategies were applied to generate data under the null hypothesis: shuffling sample names, shuffling counts within samples, shuffling counts within taxa, and fully randomizing the counts table. Methods based on the negative binomial distribution (DESeq2 and edgeR) produced p-values that were consistently smaller than expected under the null hypothesis. In contrast, methods that attempt to correct for compositionality (ALDEx2, ANCOM-BC2, and metagenomeSeq) tended to produce larger-than-expected p-values, even when only sample labels were shuffled, a permutation strategy that does not alter compositional structure. These deviations were dependent on dataset characteristics and permutation strategy, suggesting complex interactions between underlying data structure and algorithm performance. Generating data to follow the expected negative binomial distribution did not eliminate the tendency of DESeq2 and edgeR to exaggerate statistical significance. Although similar patterns were observed in RNA sequencing (RNAseq) datasets, the deviations were less pronounced than in microbiome data. 
In contrast, the classical t-test and Wilcoxon test yielded p-value distributions consistent with theoretical expectations across datasets and permutation strategies. Conclusions: These results indicate that the performance of several widely used differential abundance methods can be problematic under null conditions and may affect biological interpretation. Our findings emphasize the importance of careful method selection and highlight the robustness of simpler statistical approaches for reliable inference.
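The four permutation strategies named in this abstract can be illustrated with a small sketch. This is one plausible reading of each strategy and not the authors' implementation: the function names, the taxa-by-samples list-of-lists layout, and the toy counts are all assumptions made for illustration.

```python
import random


def shuffle_sample_labels(labels, rng):
    """Permute group labels across samples; the counts table is untouched."""
    shuffled = labels[:]
    rng.shuffle(shuffled)
    return shuffled


def shuffle_within_samples(table, rng):
    """Permute counts inside each sample (column); per-sample totals are kept."""
    n_taxa, n_samples = len(table), len(table[0])
    out = [row[:] for row in table]
    for j in range(n_samples):
        column = [table[i][j] for i in range(n_taxa)]
        rng.shuffle(column)
        for i in range(n_taxa):
            out[i][j] = column[i]
    return out


def shuffle_within_taxa(table, rng):
    """Permute counts inside each taxon (row); per-taxon totals are kept."""
    out = []
    for row in table:
        new_row = row[:]
        rng.shuffle(new_row)
        out.append(new_row)
    return out


def shuffle_whole_table(table, rng):
    """Permute every cell of the table; only the multiset of counts is kept."""
    n_samples = len(table[0])
    flat = [value for row in table for value in row]
    rng.shuffle(flat)
    return [flat[k:k + n_samples] for k in range(0, len(flat), n_samples)]


# Hypothetical toy data: 3 taxa (rows) x 4 samples (columns).
rng = random.Random(0)
counts = [[5, 0, 3, 2], [1, 7, 0, 4], [9, 2, 6, 0]]
groups = ["case", "case", "control", "control"]
null_groups = shuffle_sample_labels(groups, rng)
null_counts = shuffle_within_taxa(counts, rng)
```

Each function returns a new object and leaves its input unchanged, so the same table can be passed through all four strategies. A differential abundance method would then be run on each permuted dataset and the resulting p-values compared with the uniform distribution expected under the null.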
Rodriguez, L. K.; Schallhart, S.; Hobmeier, P.; Curran, T.; Perez-Jorge, S.; Prieto, R.; Oliveira, C.; Silva, M. A.; Thalinger, B.
Show abstract
Environmental DNA (eDNA) analyses have become a powerful tool for non-invasive biodiversity monitoring, yet the applicability of population genetic approaches to environmental samples remains largely unexplored. Even when genetic traces originate from a single individual, low target DNA concentrations and amplification or sequencing artefacts can compromise downstream genetic inferences. Here, we present a novel approach for obtaining demographic insights and lineage-level mitogenomic information from aquatic eDNA samples collected near vertebrate individuals. Paired eDNA and tissue samples were collected during sperm whale (Physeter macrocephalus) encounters in the Azores. Samples were screened for the presence of vertebrate eDNA and analyzed with a novel molecular sex identification assay. Additionally, long-range PCR was used to amplify up to five mitochondrial DNA fragments (~3-4 kb) before subsequent sequencing on an Oxford Nanopore Technologies platform. A stringent three-tier filtering framework capable of identifying true mitogenomic variation across eDNA samples was developed for maximum recovery of genetic diversity at the haplogroup level. By benchmarking eDNA samples via their paired tissues, parameter values were optimized to maximize concordance and minimize spurious variant calls. Sexing was successful for 50% of eDNA samples, with 96% concordance to paired tissues, and marine vertebrate DNA concentration significantly predicted sexing success. Further, Medaka polishing produced high identity mitochondrial consensus sequences (>16 kb) from eDNA samples. Across filtering regimes in the framework, curated SNP panels comprising up to 453 high-confidence mitochondrial SNPs resolved 19 haplogroups, with 93% concordance between eDNA and tissue samples. 
An intermediate bioinformatics filtering strategy maximized biologically accurate haplogroup recovery while minimizing sequencing artefacts, providing the most reliable lineage-level inferences. This integrative approach demonstrates that targeted nuclear assays combined with long-range mitochondrial sequencing can recover individual-level genetic information from aquatic eDNA. By defining analytical thresholds governing success, the framework advances non-invasive genetic monitoring of populations via eDNA and enables population-level monitoring and conservation of endangered and genetically-vulnerable species.
Neylan, I. P.; Vaidya, R.; Dassanayake, M.; Navarrete, S. A.; Kelly, M. W.; Faircloth, B. C.
Show abstract
Tigriopus copepods are found in splash pools on all seven continents from the equator to Arctic and Antarctic regions. Given their geographic distribution, frequent exposure to extreme environmental conditions in the high intertidal zone, and strong signatures of local adaptation, these copepods have become models for exploring patterns of adaptation to stressful environments. However, most studies focus on a relatively small subset of Tigriopus species, and there are few genome resources representing the diversity of Tigriopus species and populations. Here, we combine long-read, Pacific Biosciences HiFi data with short-read, Illumina HiC and RNA-seq data to assemble and annotate a genome sequence representing a Tigriopus population from the coast of central Chile. Based on the level of divergence that we observed in mitochondrial genes, we also performed a comparison of morphological characteristics between individuals of this population and members of the T. angulatus complex. The haplotypes that we assembled (qhTigAngs1.1.hap1 & qhTigAngs1.1.hap2) are placed into 12 major scaffolds (N50 18-19 Mbp, L50 6-7), equivalent to the number of chromosomes in other Tigriopus species. BUSCO and k-mer analyses of each haplotype and BUSCO analyses of gene models are relatively complete (95-99%) with respect to gene and k-mer content. Analyses of mitochondrial data also suggest that the Chilean population of Tigriopus we sampled may represent a novel species that we call Tigriopus aff. angulatus. These genomic resources will help us understand the diversity and structure of Tigriopus species and populations as well as facilitate future comparisons of adaptation across parallel environmental gradients.
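For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal sketch of how N50 and L50 are conventionally computed from scaffold lengths. The function name and the toy lengths are hypothetical and do not come from the assembly described in the abstract.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50 is the length of the scaffold at which the running total, taken
    from longest to shortest, first reaches half of the assembly size.
    L50 is the number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mbp, for illustration only.
toy_scaffolds = [22, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10]
n50, l50 = n50_l50(toy_scaffolds)  # (17, 5) for this toy input
```

With twelve similarly sized chromosome-scale scaffolds, L50 lands near the middle of the ranked list, which is the pattern the reported values (L50 of 6-7 for 12 major scaffolds) would suggest.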
Ewers, I.; MAUVISSEAU, Q.; Jamy, M.; Rueckert, S.; Mahe, F.; Dunthorn, M. E.
Show abstract
The Leray-XT primer pair has been widely used to amplify the mitochondrial cytochrome c oxidase subunit I (COI) gene from animals. In some marine metabarcoding studies, protists have also been amplified and sequenced using these primers. Here, we ask if the Leray-XT COI primer pair is suitable for observing ciliates and radiolarians, which are numerically and ecologically important components of marine protistan communities. We show that while there are sufficient COI reference sequences for ciliates in NCBI for taxonomic assignments, there are currently only two COI reference sequences for radiolarians. Using in-silico analyses, we additionally show that while the reverse Leray-XT primer can bind and potentially amplify both ciliates and radiolarians, the forward primer cannot bind to either taxon. These results show that the Leray-XT primer pair is not suitable for observing ciliates and radiolarians, although it may be useful for observing other marine protistan taxa.
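An in-silico primer-binding check of the kind described above can be sketched as a mismatch count between a degenerate primer and every window of a template sequence. This is a simplified illustration and not the authors' analysis: the primer and template below are made-up toy sequences rather than Leray-XT or any real COI sequence, and the function names are hypothetical. The sketch scans one strand only, so a reverse primer would need to be reverse-complemented first, and symbols outside the standard IUPAC nucleotide codes (inosine, for example) would need an extra mapping.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, site))


def best_binding_site(primer, template):
    """Slide the primer along the template; return (position, mismatches).

    Ties are resolved in favour of the leftmost position.
    """
    primer, template = primer.upper(), template.upper()
    width = len(primer)
    if len(template) < width:
        raise ValueError("template is shorter than the primer")
    scores = [
        (mismatches(primer, template[i:i + width]), i)
        for i in range(len(template) - width + 1)
    ]
    n_mismatch, position = min(scores)
    return position, n_mismatch


# Toy sequences, purely illustrative.
toy_primer = "GGWACY"
toy_template = "TTGGAACTCC"
position, n_mismatch = best_binding_site(toy_primer, toy_template)
```

In practice a mismatch threshold, often stricter near the 3' end of the primer, would decide whether a site counts as a likely binding site; that threshold is a design choice this sketch leaves open.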
Gold, Z.; Robinson, K. M.; Gehman, A.-L. M.; Shea, M. M.; Lemay, M. A.; Weinrich, J.; Kellogg, C. T. E.; Clemente-Carvalho, R. B. G.; Schiebelhut, L. M.; Boehm, A. B.; Kidd, A.; Kim, A.; Hodin, J.; Dawson, M.; McAllister, S. M.
Show abstract
The sunflower sea star (Pycnopodia helianthoides) suffered a catastrophic population decline across its range from 2013 to 2017 due to the devastating Vibrio pectenicida FHCF-3-driven sea star wasting disease (SSWD) pandemic, with minimal signs of population recovery. The functional extinction of this apex predator across substantial parts of its range has created a need to identify and track the remaining intact populations. Environmental DNA (eDNA) approaches provide a simple, cost-effective, and non-destructive method for monitoring occurrences, and in some cases abundances, of marine species, consistently outperforming visual occurrence monitoring efforts in sensitivity, speed, and cost. Here, we designed, developed, and validated a P. helianthoides-specific eDNA assay to identify refugia, using both quantitative and digital droplet PCR approaches. We first generated the most comprehensive sea star mitochondrial genome reference database to date (n = 93 taxa, n = 15 novel). We then used unikseq and Geneious bioinformatics software to identify the unique nad5 gene region and design a highly specific hydrolysis probe-based PCR assay. We validated the performance of this assay through laboratory, mesocosm, and field testing, demonstrating a highly specific and sensitive assay. In a field application of the new assay across regions in British Columbia, Canada, we found a positive correlation between P. helianthoides eDNA concentrations and biomass density, especially when appropriately accounting for spatiotemporal integration scales (R² = 0.67). The eDNA assay provides a rapid and scalable tool for monitoring the sunflower sea star, which has been proposed for listing as threatened under the U.S. Endangered Species Act of 1973. 
Molecular tools like the one presented here enhance management and recovery efforts not only by identification and monitoring of remnant wild populations, but also by helping to assess population level response and recovery following reintroduction efforts.
Montague, T. G.; Rubino, F. A.; Gibbons, C. J.; Mungioli, T. J.; Small, S. T.; Coffing, G. C.; Kern, A. D.
Show abstract
The coleoid cephalopods (octopus, cuttlefish, and squid) are emerging model organisms for neuroscience, development, and evolutionary biology. Determining their sex early in life is critical for population management and controlled experiments. Here, we present a protocol to non-invasively determine the sex of multiple cephalopod species as young as 3 hours post-hatching using a skin swab and quantitative PCR (qPCR). We describe steps for designing qPCR primers, swabbing live animals, extracting DNA, running the qPCR, and analyzing the results. For complete details on the use and execution of this protocol, please refer to Rubino et al. [1]. Highlights: Swab live cephalopods as early as 3 hours post-hatching; extract DNA from cephalopod skin swabs; perform qPCR-based sex determination; design and validate qPCR primers for new species. [Graphical abstract: Figure 1]